Although meritocracy is a key organizing principle of academia, the degree to which individual and institutional differences in scholarly outputs reflect meritocratic differences is both contentious and poorly understood. In this talk, I argue that in the science of science, institutional prestige is a first-order variable, shaping nearly all aspects of what science gets done, who does it, and who gets to be a scientist. In short, little in academia makes sense except in the light of prestige. I'll support this argument with evidence from three directions: (1) how prestige structures faculty hiring networks, i.e., who hires whose graduates as faculty, (2) how an institution's prestige shapes the productivity of its faculty, and (3) how prestige shapes the spread of scientific ideas. These findings suggest that the social mechanisms that drive scientific innovation are fundamentally grounded in the prestige economy of the scientific community, and efforts to develop a science of science must account for its effects in order to understand knowledge production.
Aaron Clauset Associate Professor, University of Colorado Boulder
My group's research activities are broad and multidisciplinary, and we are active participants in the network science, complex systems, computational biology, and computational social science communities. Our work generally focuses on understanding the mechanisms by which large-scale patterns emerge from the collective actions of heterogeneous individuals and on developing novel techniques for inferring such patterns and mechanisms from rich data sources. Much of this work is methodological in nature, and we actively develop novel statistical and computational methods for automatically analyzing and modeling complex phenomena in biological, social and technological systems. All of these efforts draw heavily on data analysis, machine learning, statistics, probability, algorithms and graph theory. We are particularly interested in interactions between theory and data, and the development of rigorous methods for the study of complex systems.
The study of scientific publishing practices is a prominent topic in the science of science. There is a rich body of work studying the journal publishing system, the organization of peer review, the use of impact factors and other citation metrics, the adoption of open access publishing and other open science practices, challenges around research integrity, the complex role of commercial publishers, and so on. The science of science community has deep knowledge of many of the intricate issues related to scientific publishing. I will review some of this knowledge in my lecture. In addition, I will make a call for more reflection on our own publishing practices as science of science researchers: How do we, as members of the science of science community, disseminate our own research, and to what extent do we use our knowledge of the scientific publishing system to actively contribute to improving the system? I will argue that we need to do better: the science of science community has a responsibility to lead by example.
Every day, our lives are made easier by efficient measures and algorithms that search and rank scientific information. Yet these measures and algorithms have an issue: they are trained on citations, which carry ingrained human biases. Their output is therefore inherently biased too, creating inequalities, raising concerns of discrimination, and even harming economic growth. In this talk, I will present research focused on quantifying various biases in publication data. The overarching goal of this research is to uncover the bias mechanisms that, given the same quality, drive different citation trajectories, and to use them (1) to create fair measures and algorithms and (2) to improve our understanding of the scientific enterprise.
Roberta Sinatra Associate Professor, ITU Copenhagen
Roberta Sinatra is Associate Professor in Data Science and Network Science at ITU Copenhagen and holds visiting positions at ISI (Turin, Italy) and the Complexity Science Hub (Vienna, Austria). Starting in October 2022, she will be Professor in Social Data Science at the University of Copenhagen. She is the coordinator of the NEtwoRks, Data, and Society (NERDS) research group and a co-lead at the AI pioneer centre in Copenhagen. Her research is at the forefront of network science, data science, and computational social science. Currently, she pays particular attention to the analysis and modeling of the dynamics that lead to the collective phenomenon of success, with a focus on science and art, and to data-for-good applications. Roberta completed her undergraduate and graduate studies in Physics at the University of Catania, Italy, and was first a James McDonnell postdoctoral fellow and then a research faculty member at the Center for Complex Network Research at Northeastern University (Boston, MA, USA). Her research has been featured in The New York Times, Forbes, The Economist, The Guardian, and The Washington Post, among other major media outlets. She has been awarded the Complex Systems Society Junior Prize, the DPG Young Scientist Award for Socio- and Econophysics, and a Villum Young Investigator grant.
Data are needed to answer key questions about much of science: critical technologies, the transparency of AI, and programmatic efficiency. Yet data are often not discoverable by the communities that need to access and use them. This presentation introduces a set of innovative technologies to democratize the search for and discovery of data. It provides new ways to (i) discover how scientific data are used, drawing on a broad cross-section of primary, secondary, and even tertiary sources, (ii) validate that information with key stakeholders, and (iii) communicate the value of data. https://hdsr.mitpress.mit.edu/pub/g6e8noiy/release/2
Julia Lane Professor, New York University
Julia Lane is a Professor at the NYU Wagner Graduate School of Public Service, at the NYU Center for Urban Science and Progress, and an NYU Provostial Fellow for Innovation Analytics. She cofounded the Coleridge Initiative, whose goal is to transform the way governments access and use data for the social good through training programs, research projects, and a secure data facility. The approach is attracting national attention, including from the Commission on Evidence-Based Policymaking and the Federal Data Strategy.
Scientific retractions have been on the rise recently. Retracted papers are frequently discussed online, enabling the broad dissemination of potentially flawed findings. Our analysis spans a nearly 10-year period and reveals that most papers exhaust their attention by the time they get retracted, meaning that retractions cannot curb the online spread of problematic papers. This is striking, as we also find that retracted papers are pervasive across mediums, receiving more attention after publication than nonretracted papers even on curated platforms such as news outlets and knowledge repositories. Interestingly, discussions on social media express more criticism toward subsequently retracted results and may thus contain early signals of unreliable work.
Daniel Romero Associate Professor, University of Michigan
My main research interest is the empirical and theoretical analysis of Social and Information Networks. I am particularly interested in understanding the mechanisms involved in network evolution, information diffusion, and user interactions on the Web.
Few would disagree that the scholarly record should be transparent, open, and interconnected. The Research Nexus is a rich and reusable open network of relationships connecting research organizations, people, things, and actions; a scholarly record that the global community can build on forever, for the benefit of society. This is all made possible through rich metadata provided in a sustainable and open way. This talk will cover the metadata and identifiers used to build the Research Nexus, the current state of the nexus landscape, and a brief introduction to interrogating Crossref metadata via APIs, as well as challenges and future plans.
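As a concrete taste of what interrogating Crossref metadata looks like, here is a minimal sketch (not from the talk) that fetches one work's metadata from the public Crossref REST API. The DOI and contact address are placeholders, and the `requests` library is assumed.

```python
import requests

# Fetch metadata for a single work from the public Crossref REST API.
doi = "10.5555/12345678"  # placeholder; substitute any registered DOI
url = f"https://api.crossref.org/works/{doi}"
# Crossref's "polite pool" asks clients to identify themselves with a mailto.
resp = requests.get(url, headers={"User-Agent": "demo (mailto:you@example.org)"})
resp.raise_for_status()
work = resp.json()["message"]

print(work["title"][0])                           # primary title
print(work.get("reference-count"), "references")  # outbound reference links
# Relationship metadata that feeds the Research Nexus appears in fields
# such as "reference", "funder", "license", and "relation".
print(sorted(work.get("relation", {}).keys()))
```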
Patricia Feeney Head of Metadata, Crossref
Patricia's role as Head of Metadata was created in 2018 to bring together all aspects of metadata, such as our strategy and overall vision, the review and introduction of new content types, best practice around inputs (Content Registration) as well as outputs (representations through our APIs), and consulting with the community about metadata. During her 10 years at Crossref, she’s helped thousands of publishers understand how to record and distribute metadata for millions of scholarly items. She’s also worked in various scholarly publishing roles and as a systems librarian and cataloger.
Faculty hiring and retention determine the composition of the U.S. academic workforce and directly shape educational outcomes, careers, the development and spread of ideas, and research priorities. But hiring and retention are dynamic, reflecting societal and academic priorities, generational turnover, and efforts to diversify the professoriate along gender, racial, and socioeconomic lines. Understanding these processes requires that we import diverse methods and concepts, drawing on the study of social hierarchies, network science, epidemiology, and demography. In this talk, we'll analyze the academic employment and doctoral education of tenure-track faculty at all PhD-granting U.S. universities over the decade 2011-2020, quantifying stark inequalities in faculty production, prestige, retention, and gender. In the process, we'll introduce numerous methods and concepts, with the goal of connecting the analysis of faculty hiring and retention with methodologies and frameworks that are broadly useful across the science of science.
Dan Larremore Assistant Professor, University of Colorado Boulder
Daniel Larremore is an assistant professor in the Department of Computer Science and the BioFrontiers Institute. His research develops statistical and inferential methods for analyzing large-scale network data, and uses those methods to solve applied problems in diverse domains, including public health and academic labor markets. In particular, his work focuses on generative models for networks, the ongoing evolution of the malaria parasite and the origins of social inequalities in academic hiring and careers. Prior to joining the CU Boulder faculty, he was an Omidyar Fellow at the Santa Fe Institute (2015-2017) and a post-doctoral fellow at the Harvard T.H. Chan School of Public Health (2012-2015). He obtained his PhD in applied mathematics from CU Boulder in 2012, and holds an undergraduate degree from Washington University in St. Louis.
Establishing causality is one of the most important goals in scientific research. Researchers also tend to explain the implications of their findings and give practical advice based on causal results. Therefore, the language that describes causal claims and their implications plays a crucial role in communicating research results among scientists and with the general public. However, scientists and journalists have been found to use causal language inappropriately in research publications and news articles, such as extrapolating correlational findings, often from observational studies, into causal claims and practical advice. In the past few years, our research group has developed a suite of NLP models for predicting the presence and strength levels of claims and advice given in research conclusions. Applying these models to PubMed articles, we studied the characteristics of causal language use and advice giving in observational studies. We further developed a tool to link research papers with press releases, with which we were able to compare corresponding claims and detect potential exaggerations. Our work contributes to illustrating the current state and longitudinal patterns of causal language use and advice giving in the science literature. The computational tools can also help detect potential exaggerations in science communication.
Bei Yu Professor, Syracuse University
My research areas are Natural Language Processing and Computational Social Science. More specifically, my research focuses on using machine learning and natural language processing techniques to improve information quality, organization and access, especially in the science and health domain. I am particularly interested in linguistic patterns that characterize people’s opinions, emotions, and language styles, and their roles in information representation and sharing in science literature, news and social media. My most recent work focuses on computational modeling of misinformation in science news. For example, I developed new methods for identifying exaggerated claims in science news by extracting and comparing claims from news articles and research papers. I also collaborate with social scientists to computationally operationalize social science theories and concepts to answer research questions in a variety of domains, such as political science, business, and mass communication. My work has been supported by funding sources including NSF, IMLS, and Microsoft.
Sarah Bratt Assistant Professor, School of Information University of Arizona
Dr. Sarah Bratt is an Assistant Professor at the University of Arizona School of Information (iSchool). She holds a B.S. in Philosophy from Ithaca College and an M.S. in Library and Information Science with a Data Science certificate from Syracuse University. Her research lies at the intersection of scholarly communication, research data management, and the science of science. The overarching goal of her research is to understand and design for long-term research data sustainability and actionable science policy. Her research has been published in Quantitative Science Studies (QSS), the Journal of Informetrics, and Scientometrics. She was a Research Fellow at the Laboratory for Innovation Science at Harvard (LISH) and a Fellow at the iSchool Inclusion Institute (i3), and she received multiple awards, including the Master's prize in Library & Information Science at Syracuse University and an honorable mention as a 2022 Better Scientific Software (BSSw) Fellow.
Gizem Korkmaz Senior Research Scientist, Coleridge Initiative
Gizem Korkmaz is a PhD-level economist with 10 years of experience using data science and conducting social, behavioral, and policy research in partnership with local, state, and federal government stakeholders and industry sponsors to solve real-world problems. She is a Senior Research Scientist at the Coleridge Initiative. Previously, she was an Associate Professor at the University of Virginia's Biocomplexity Institute. She has contributed extensively to network science (both mathematical modeling and empirical analysis) with applications to collaboration networks (e.g., GitHub), online social networks (e.g., Twitter, Facebook), and communication networks, among others. The hallmark of her research is integrating her background in economics with data science by blending traditional and novel data sources (e.g., social media) and methods (e.g., network analysis, machine learning) to ask how we can make data useful for people and communities. She is passionate about developing a workforce that uses data science to contribute to community-based research. She was the co-director of UVA's Data Science for the Public Good (DSPG) Young Scholars Program and led the launch of the DSPG program in Turkey.
Chaoqun Ni Assistant Professor, Information School (iSchool), University of Wisconsin-Madison
Chaoqun Ni is an Assistant Professor at the iSchool of the University of Wisconsin-Madison and the co-director of the Metascience Research Lab. Her research aims to identify variables that impede and facilitate the creation of a competitive scientific workforce in order to inform science policy decision making. She is currently working on projects related to gender inequalities in science.
Every step of science-making leaves behind a digital trace of scientists’ own behaviour. Armed with new computational tools, scientists of science scour these data for signals that will help them make the scientific enterprise more efficient and more equitable. The field is no longer a niche of specialist interest, but a stream of research with clearly defined use cases in science-making, funding, and science policy, complementing other metasciences. But to live up to its promise and truly make a difference, the field must become more actionable. It should not just tell us how the world of science currently is; it should also push us towards the science that could be. This applies to all stages of research, from what kinds of questions researchers ask (causal or correlational) to how they communicate their results to the public and policymakers. Being more actionable also means diversifying data sources and publishing reproducible science. In this session, I will cover key steps in actionable science-making and its effective communication.
Arunas Radzvilavicius Editor, Nature Human Behaviour
Arunas’ interdisciplinary background is in mathematical modelling and computational approaches to genetics, psychology, economics, and evolution. Before joining Nature Human Behaviour, Arunas worked with six research groups around the world, including in postdoctoral positions at the University of Pennsylvania, the University of Sydney, and the University of Zurich. He later spent a year as a junior fellow at Berlin’s Institute for Advanced Study, where he worked on building game-theoretical frameworks for the evolution of morality and of behaviour guided by social norms, and on exploring ways of testing theories using large-scale social media datasets. Arunas is passionate about openness, reproducibility, and diversity in science, and the transformative opportunities that computational approaches bring to the modern behavioural sciences. At Nature Human Behaviour, Arunas works with manuscripts in genetics, computational social science, science of science, evolution, and other interdisciplinary computational work.
Peer review is an important part of science, aimed at providing expert and objective assessment of a manuscript. Because of many factors, including time constraints, unique expertise needs, and deference, many journals ask authors to suggest peer reviewers for their own manuscript. Previous studies of this practice have reached differing conclusions, possibly because of limited sample sizes. In this article, we analyze the association between author-suggested reviewers and review invitations, review scores, acceptance rates, and subjective review quality, using a large dataset of close to 8K manuscripts, 46K authors, and 21K reviewers from the journal PLOS ONE's Neuroscience section. We find that all-author-suggested review panels increase the chances of acceptance by 20 percentage points relative to all-editor-suggested panels, even though author-suggested reviewers agree to review less often. While PLOS ONE has since ended the practice of asking for suggested reviewers, many other journals still use them and should perhaps consider the results presented here.
Daniel Acuña Associate Professor, Department of Computer Science, University of Colorado, Boulder
Daniel Acuña is an Associate Professor in the Department of Computer Science at the University of Colorado, Boulder. The goal of his current research is to understand decision making in science, from helping hiring committees predict future academic success to removing the potential biases that scientists and funding agencies introduce during peer review.
Stephen David Associate Professor of Otolaryngology - Head and Neck Surgery, School of Medicine, Oregon Health & Science University
Stephen David joined the OHSU faculty in February 2012. Before coming to OHSU, he received his Ph.D. in Bioengineering from the University of California, Berkeley in 2006 and subsequently completed postdoctoral work in the Institute for Systems Research at the University of Maryland, College Park.
Peer review is the backbone of scientific research, and any issues therein considerably affect the progress of science, the careers of researchers, billions of dollars in grants, and the public perception of science. We will give an overview of various challenges in peer review -- biases, fraud, subjectivity, miscalibration, noise, and policy choices -- and discuss experiments and computational solutions to understand and/or address these challenges. A detailed writeup is available at http://bit.ly/PeerReviewOverview and slides at https://www.cs.cmu.edu/~nihars/tutorials/Shah_Tutorial_PeerReview_2021.pdf
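One concrete example of a computational piece of peer review is reviewer assignment. The sketch below (an illustration, not taken from the tutorial) assigns each reviewer one paper so as to maximize total reviewer-paper similarity, using the Hungarian algorithm via SciPy; the similarity scores here are made up.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# Hypothetical reviewer-paper similarity scores
# (rows: reviewers, columns: papers).
similarity = np.array([
    [0.9, 0.1, 0.4],
    [0.3, 0.8, 0.2],
    [0.5, 0.4, 0.7],
])

# Hungarian algorithm: one paper per reviewer, maximizing total similarity.
reviewers, papers = linear_sum_assignment(similarity, maximize=True)
for r, p in zip(reviewers, papers):
    print(f"reviewer {r} -> paper {p} (similarity {similarity[r, p]:.1f})")
```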
Nihar B. Shah Assistant Professor, CMU
I am an assistant professor at CMU in the Machine Learning and Computer Science departments. I work in the areas of machine learning, statistics, information theory, and game theory. My current work addresses various biases and other challenges in human evaluations via principled and practical approaches. A focus application is scientific peer review, where our work has already made a significant impact.
With over one million papers added every year to the PubMed biomedical index alone, the explosion of scholarly knowledge presents tremendous opportunities for accelerating research across the sciences. However, while scientific knowledge is expanding rapidly, our human minds have limited capacity for finding, assimilating, and manipulating information. In this talk, I will present our recent work toward helping researchers and practitioners make use of this vast treasure trove of information. I will highlight computational methods in artificial intelligence and natural language processing that mine literature and knowledge bases to help discover new directions and solutions to problems, generate hypotheses, make predictions and decisions, and build connections across different ideas and areas. This includes a method that predicts clinical outcomes of hospital patients by retrieving patient-specific medical literature, NLP-powered models for predicting off-label drug uses, and novel exploratory search and recommendation engines: for discovering drug combination therapies, causal relationships, challenges and directions in COVID-19 research, and authors who inspire novel directions and collaboration opportunities in the computer science community.
Tom Hope Assistant Professor, Hebrew University of Jerusalem (HUJI), and Research Scientist, Allen Institute for AI (AI2)
Tom Hope is an assistant professor at the Hebrew University of Jerusalem's School of Computer Science and Engineering and a research scientist at the Allen Institute for AI (AI2). Prior to that, he was a postdoctoral researcher at AI2 and the University of Washington, working on accelerating scientific discovery with Daniel Weld and Eric Horvitz. His work has received four best paper awards, appeared in top venues, and been covered by Nature and Science. In parallel to his PhD, Tom led an applied AI research team at Intel that published award-winning work. He was awarded the Azrieli Early Career Faculty Fellowship, was selected for the 2021 Global Young Scientists Summit and the 2019 Heidelberg Laureate Forum, and was a member of the KDD 2020 Best Paper Selection Committee.